On Not Making Dissimilarities Euclidean
Abstract
Non-metric dissimilarity measures may arise in practice, e.g., when objects represented by sensory measurements or by structural descriptions are compared. It is an open issue whether such non-metric measures should be corrected in some way to be metric or even Euclidean. The reason for such corrections is that pairwise metric distances can be interpreted in metric spaces, while Euclidean distances can be embedded into Euclidean spaces. Hence, traditional learning methods can be used. The k-nearest neighbor rule is usually applied to dissimilarities. In our earlier study [12,13], we proposed some alternative approaches to general dissimilarity representations (DRs). They rely either on an embedding into a pseudo-Euclidean space and building classifiers there, or on constructing classifiers on the representation directly. In this paper, we investigate ways of correcting DRs to make them more Euclidean (metric), either by adding a proper constant or by some concave transformations. Classification experiments conducted on five dissimilarity data sets indicate that non-metric dissimilarity measures can be more beneficial than their corrected Euclidean or metric counterparts. The discriminating power of the measure itself is more important than its Euclidean (or metric) properties.
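The two kinds of correction named in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes the standard test for Euclidean behavior (the doubly centered Gram matrix of squared dissimilarities has no negative eigenvalues), the textbook constant-shift correction of the squared off-diagonal entries, and an element-wise power with exponent below 1 as one possible concave transformation. The function names and the toy matrix are hypothetical.

```python
import numpy as np


def gram_from_dissimilarities(D):
    """Doubly centered Gram matrix B = -1/2 * J * D^2 * J (squares are element-wise)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return -0.5 * J @ (D ** 2) @ J


def min_gram_eigenvalue(D):
    """Smallest eigenvalue of B; a negative value means D is not Euclidean."""
    return float(np.linalg.eigvalsh(gram_from_dissimilarities(D)).min())


def shift_to_euclidean(D):
    """Constant-shift correction: add 2c to every squared off-diagonal entry.

    With c = -lambda_min(B), the Gram matrix becomes B + c*J, which is
    positive semidefinite, so the corrected dissimilarities are Euclidean.
    """
    c = max(0.0, -min_gram_eigenvalue(D))
    off_diagonal = 1.0 - np.eye(D.shape[0])
    return np.sqrt(D ** 2 + 2.0 * c * off_diagonal)


def power_transform(D, p=0.5):
    """One possible concave transformation (assumed here): element-wise d**p, 0 < p < 1."""
    return D ** p


# Hypothetical toy data: d(0,2) = 5 > d(0,1) + d(1,2) = 2, so D is non-metric.
D = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 1.0],
              [5.0, 1.0, 0.0]])

D_shifted = shift_to_euclidean(D)
D_concave = power_transform(D, p=0.5)

print("min eigenvalue, original:", min_gram_eigenvalue(D))
print("min eigenvalue, shifted: ", min_gram_eigenvalue(D_shifted))
print("min eigenvalue, d**0.5:  ", min_gram_eigenvalue(D_concave))
```

The constant shift always yields a Euclidean matrix, but it changes all off-diagonal dissimilarities by the same squared amount regardless of how informative they were. A power transform with a fixed exponent does not guarantee a Euclidean result; the exponent would have to be chosen per data set.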
Related resources
Non-Euclidean Dissimilarities: Causes, Embedding and Informativeness
In many pattern recognition applications object structure is essential for the discrimination purpose. In such cases researchers often use recognition schemes based on template matching which lead to the design of non-Euclidean dissimilarity measures. A vector space derived from the embedding of the dissimilarities is desirable in order to use general classifiers. An isometric embedding of the ...
Ricci flow embedding for rectifying non-Euclidean dissimilarity data
Pairwise dissimilarity representations are frequently used as an alternative to feature vectors in pattern recognition. One of the problems encountered in the analysis of such data is that the dissimilarities are rarely Euclidean, while statistical learning algorithms often rely on Euclidean dissimilarities. Such non-Euclidean dissimilarities are often corrected or a consistent Euclidean geome...
Relational Generative Topographic Map
The generative topographic mapping (GTM) has been proposed as a statistical model to represent high dimensional data by means of a sparse lattice of points in latent space, such that visualization, compression, and data inspection become possible. Original GTM is restricted to Euclidean data points in a vector space. Often, data are not explicitly embedded in a Euclidean vector space, rather pa...
Explaining the Non-Euclidean Pattern in Urban Planning
With the dominance of Kant's epistemology and instrumental reason in social science and human geography, the interpretation of space has been based on neo physics, which is often equated with intuitive and physical experience and with the place of capital and its reproduction. Therefore, we first present the ontological transformation of the concept of space, and along the way we enumerate the c...
Non-Euclidean Dissimilarities: Causes and Informativeness
In the process of designing pattern recognition systems one may choose a representation based on pairwise dissimilarities between objects. This is especially appealing when a set of discriminative features is difficult to find. Various classification systems have been studied for such a dissimilarity representation: the direct use of the nearest neighbor rule, the postulation of a dissimilarity...